Engineering novel proteins by transfer of active sites to natural 
scaffolds 

Claudio Vita 



Novel functional proteins have been generated by the transfer 
of active sites to structurally homologous proteins and to new 
structural contexts. The most successful examples of these
approaches have provided effective new tools in
biochemistry and protein chemistry and have suggested new
models for drug design.
Abbreviations 

CDR complementarity-determining region 
IMDH isopropylmalate dehydrogenase 

Introduction 

Engineering novel proteins exhibiting a predetermined 
fold and a specific function is an exciting prospect of 
protein engineering. Nevertheless, owing to our limited 
understanding of the relations between structure and 
function, we are still a long way from being able to 
design new proteins with new functions at will. A 
traditional approach used to yield new functional proteins 
entailed modifying existing proteins by large scale random 
mutagenesis and (time consuming) screening of individual 
mutants of the protein of interest. An alternative, 'rational'
approach to generating new binding or catalytic activities 
may involve the transfer of functional sites to appropriate 
natural structural scaffolds. Early successes, for example 
redesigning the specificity of a DNA-binding protein by 
exchanging the solvent-exposed functional residues of the 
'recognition helix' with those of a homologous protein [1] 
and the swapping of the complementarity-determining 
region (CDR) loops from one antibody to another in 
order to transfer antigen recognition specificity [2], have 
definitively demonstrated that functional sites can be 
transferred from one protein to another, with conservation 
of structural integrity and gain in function. These 
examples had a seminal role in protein engineering and 
stimulated other applications, considered in this review. 
These 'transfers' will be described in three separate 
sections, corresponding to three conceptually different 
approaches: first, transfer of the functional residues from 
one protein to a structurally homologous one; second, 
transfer of a functional peptide sequence to a host protein 
structure (presentation scaffold), without considerations of 
structural homology but with the purpose of limiting the
sequence flexibility; and third, transfer of a well-ordered
active site to a different structural context to create a new 
function on an appropriate natural scaffold. 

Transfer of active sites to homologous 
proteins 

Changing substrate specificity 

Serine proteases have been the subject of intensive
engineering studies aimed at changing their specificity.
The examples reported illustrate how this was effectively
realized by the transfer of appropriate functional elements
from one protease to a homologous one. Trypsin and
chymotrypsin have similar tertiary structures, with only
four amino acid differences in the S1 substrate binding
site (formed by residues 189-195, 215-220 and 225-228).
Yet trypsin cleaves peptides at arginine and lysine residues,
whereas chymotrypsin prefers large hydrophobic residues.
Replacement of the four divergent residues of the S1 site
of cow trypsin with those of cow chymotrypsin is not
sufficient to change the specificity for amide hydrolysis [3].
Trypsin is converted to a chymotrypsin-like protease,
however, when two surface loops of trypsin (residues 185-188
and 221-225) are also replaced by the
analogous chymotrypsin loops [3]. These loops are not structural
components of either the S1 binding site or the extended
substrate binding sites: their effect is not on substrate
binding but on the rate of the catalytic process, which they
accelerate. Attempts to convert chymotrypsin to trypsin by
the same means failed [4]: additional factors are probably
involved in the substrate discrimination of trypsin and
chymotrypsin.

Another example of recruiting substrate specificity from 
one member of a homologous gene family to another 
member by limited amino acid substitutions in the 
immediate vicinity of a bound substrate has been clearly 
demonstrated in the case of the subtilisin from Bacillus 
amyloliquefaciens. For example, the incorporation of Bacillus
licheniformis substrate specificity into B. amyloliquefaciens
subtilisin was obtained by exchanging only three residues,
involved in van der Waals contacts with the substrate,
with those of the B. licheniformis subtilisin [5]. More
recently, subtilisin BPN' was mutated to cleave substrates
containing two consecutive basic (dibasic) residues [6]
and three basic (tribasic) residues [7**]. Mutants were
designed on the basis of the structure of subtilisin BPN'
and by considering sequence differences between it and
the eukaryotic homologs Kex2 and furin, which are known
to cleave dibasic and tribasic substrates, respectively. The
incorporation of just two acidic residues, found in the
eukaryotic enzymes and proposed to interact with the
dibasic substrate at positions P1 and P2, efficiently
shifted the specificity towards basic residues. An additional
specificity for basic residues at the P4 position of the
substrate was then engineered [7**] by introducing a
single subtilisin-to-furin substitution at the S4 subsite. The
novel protease, called furilisin, cleaves tribasic substrates 
(based on the RAKR or KAKR amino acid sequence 
[single-letter code]) with high catalytic efficiency and 
specificity (Figure 1). These studies provide a basic 
example of how to manipulate substrate specificity in a 
modular fashion, thereby creating a new enzyme that may 
be a useful tool to cleave at engineered tribasic linker 
sequences between proteins and fused affinity tags. 



Figure 1

Model of an Arg-Ala-Lys-Arg peptide bound to furilisin. The novel
protease contains only three acidic substitutions, Asp166, Asp62
and Asp104 from the furin S1, S2 and S4 sites, that are supposed to
interact with the basic residues of the P1 (Arg), P2 (Lys) and P4 (Arg)
sites, respectively, of the substrate. Interestingly, the cumulative effect
of incorporating acidic residues in three separate enzyme subsites
had a substantial synergistic effect on the specificity for basic residues.
(Adapted from [7**].)



Changing coenzyme specificity 

The first successful examples of a change in coenzyme
specificity were reported for glutathione reductase [8,9]
and lipoamide dehydrogenase [10] from Escherichia
coli, two dehydrogenases containing the typical Rossmann
fold. The natural preference of the first for NADP and
of the second for NAD was inverted by limited amino
acid substitutions (exchange of critical functional residues)
in the coenzyme-binding domain on the basis of the
known structures of the two enzymes. Inverting the NAD
preference of the Thermus thermophilus isopropylmalate
dehydrogenase (IMDH), however, presented a singular
challenge [11**]. Comparison with the NADP-dependent
E. coli isocitrate dehydrogenase (IDH) revealed that a β turn
in the NAD-binding pocket of IMDH is replaced by
an α helix in IDH; thus, success critically depended
on being able to engineer secondary structures in the
enzyme binding site. Accordingly, the 13-residue α helix
of the E. coli isocitrate dehydrogenase was modeled in
place of the seven residues constituting the β turn in the
T. thermophilus IMDH: four substitutions were suggested
by this model to avoid steric packing problems, together
with four additional substitutions to stabilize the binding
pocket. The engineered dehydrogenase showed a shift of
preference from NAD to NADP by a factor of 100,000
[11**]. This example demonstrates that, for some enzymes, the
active site transfer strategy has to be combined with structural
modeling to provide novel properties.
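
One common way to express a coenzyme preference numerically is as the ratio of the catalytic efficiencies (kcat/Km) measured with each coenzyme; the shift produced by engineering is then the ratio of the mutant preference to the wild-type preference. The Python sketch below assumes this definition, which may differ from the measure used in [11**]. The kinetic constants are hypothetical values chosen only so that the arithmetic yields a 100,000-fold shift; they are not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kinetics:
    """Steady-state constants for one enzyme/coenzyme pair (hypothetical values)."""
    kcat: float  # turnover number, s^-1
    km: float    # Michaelis constant, micromolar

    @property
    def efficiency(self) -> float:
        """Catalytic efficiency kcat/Km."""
        return self.kcat / self.km


def preference(nadp: Kinetics, nad: Kinetics) -> float:
    """Preference for NADP over NAD, as a ratio of catalytic efficiencies."""
    return nadp.efficiency / nad.efficiency


def preference_shift(wt_nadp: Kinetics, wt_nad: Kinetics,
                     mut_nadp: Kinetics, mut_nad: Kinetics) -> float:
    """Fold change in NADP-over-NAD preference from wild type to mutant."""
    return preference(mut_nadp, mut_nad) / preference(wt_nadp, wt_nad)


# Hypothetical constants, for illustration only (not measured values).
wt_nad = Kinetics(kcat=10.0, km=100.0)     # wild type prefers NAD
wt_nadp = Kinetics(kcat=1.0, km=1000.0)
mut_nadp = Kinetics(kcat=10.0, km=10.0)    # mutant prefers NADP
mut_nad = Kinetics(kcat=1.0, km=1000.0)

shift = preference_shift(wt_nadp, wt_nad, mut_nadp, mut_nad)
print(f"Shift in preference towards NADP: {shift:.3g}-fold")
```

Expressing the result as a ratio of ratios makes the figure independent of the units chosen for kcat and Km, provided the same units are used for both coenzymes.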

Changing binding activity 

Engineering a new binding activity by the transfer of 
functional loops appears to be more straightforward than 
an intervention in the active site of an enzyme. The 
Cys2His2 zinc finger represents a particularly attractive
motif for protein engineering, since it is well characterized 
structurally and has distinct DNA binding properties 
specified by the solvent exposed region on the helix 
part of the molecule. Controlled DNA-binding properties 
have been introduced in the zinc-finger framework by 
substituting seven residues of the solvent exposed face of 
the helix with those taken from different zinc fingers [12]. 
This and other work reporting the display of the zinc-finger
module on phage has permitted the derivation
of specificity rules for the design of new DNA binding
proteins (reviewed in [13]). 

By transferring eight functional residues of human growth 
hormone to human prolactin, Wells and co-workers [14] 
engineered a prolactin able to bind to the growth 
hormone receptor. Such hybrid hormones could be useful 
as receptor agonists to separate receptor binding and 
activation processes. The eight residues to be transferred,
however, were not chosen on a structural basis alone: the
choice was made after seven rounds of
site-directed mutagenesis and functional assays. This
work is important because it demonstrates the feasibility 
of recruiting receptor-binding properties from distantly 
related and functionally divergent hormones, and also 
emphasizes that a detailed functional analysis is crucial to 
guide the design of a protein-protein interface. 

The structural homology between angiogenin and
ribonuclease A has been the basis of interesting
loop exchange experiments [15,16], demonstrating that
unrelated activities can be conferred on a suitable structural
platform simply by the transfer of a single loop. A
13-residue surface loop of angiogenin has been replaced
with the corresponding 15-residue loop of pancreatic
ribonuclease A [15]; the in vivo angiogenic potency of
the hybrid is markedly decreased, while the enzymatic
(ribonuclease) activity is dramatically increased. In a
complementary experiment, the same 15-residue surface
loop of ribonuclease A has been substituted by the
corresponding 13-residue loop of angiogenin [16]: the
hybrid showed angiogenic activity comparable to authentic
angiogenin and a reduced ribonuclease activity.

Another, more recent, successful example of loop
transfer illustrates the potential for changing the receptor
specificity of a protein ligand. Basic fibroblast growth
factor, presenting a β-barrel structure similar to that of
interleukin-1β, was made to bind with high affinity to the
receptor for acidic fibroblast growth factor, simply by
replacing a five-residue surface loop with the corresponding
seven-residue loop from the structural homologue acidic
fibroblast growth factor [17].
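
At the sequence level, the loop exchanges described above amount to splicing a donor segment into the host in place of the host loop, with the two segments allowed to differ in length. The Python sketch below illustrates only this bookkeeping; the function name `graft_loop`, the sequences and the residue positions are hypothetical and are not taken from any of the proteins discussed. It says nothing about whether a given chimera folds or binds, which in the studies cited had to be established experimentally.

```python
def graft_loop(host: str, start: int, end: int, donor_loop: str) -> str:
    """Replace host residues start..end (1-based, inclusive) with donor_loop.

    The donor loop may be longer or shorter than the segment it replaces,
    as in a five-residue loop exchanged for a seven-residue one.
    """
    if not (1 <= start <= end <= len(host)):
        raise ValueError("loop boundaries must lie within the host sequence")
    # host[:start - 1] keeps everything before the loop,
    # host[end:] keeps everything after it.
    return host[:start - 1] + donor_loop + host[end:]


# Hypothetical 20-residue host with a five-residue loop at positions 10-14.
host_sequence = "MKTAYIAKQRGSPDEVLHWT"
# Hypothetical seven-residue donor loop.
donor_loop = "NGKTYEL"

chimera = graft_loop(host_sequence, 10, 14, donor_loop)
print(chimera)
print(len(host_sequence), "->", len(chimera), "residues")
```

Using 1-based inclusive positions keeps the function call aligned with the residue numbering conventionally used in the protein literature.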

Transfer of functional sequences to 
presentation scaffolds 

Functional peptide sequences have been transferred, 
mainly as insertions, to permissive regions of a globular 
protein as a way to limit conformational flexibility and to 
present the sequence in a controlled structural context. 
Antigenic epitopes, for example, were inserted within 
different recipient bacterial proteins as a means to induce 
antibodies against a chosen peptide sequence, without 
the need to synthesize the peptide. These genetic
constructions, more homogeneous than the hybrids obtained
by chemical coupling and easily purified by affinity
chromatography using a specific affinity of the carrier
protein, provide stimulation of T cells through T-cell-specific
determinants [18-20]. Antigenic sequences were also
inserted in the antibody CDRs, as a means of restricting
their conformation and of eliciting an immune
response [21,22]; this process, called 'antigenization' of
antibodies, is conceptually the opposite of
that called 'humanization' of antibodies [2], which involves
the exchange of CDR loops from one antibody to another
in order to transfer antigen recognition specificity and to
reduce the immune response.

The amino acid sequence RGD (single-letter code) is 
present in a number of cell adhesion proteins (fibronectin, 
vitronectin, von Willebrand factor and fibrinogen) and is 
used for recognition by cell surface receptors (integrins). 
The concept of installing this simple tripeptide sequence 
in different proteins of known three-dimensional structure 
has been used by many groups as a way of fixing a 
conformation that may determine increased or specific 
binding to integrin receptors. The RGD motif, included 
in a sequence of varying length, was inserted into a long 
solvent-exposed loop of lysozyme; the construct possessed 
a new cell adhesion activity [23]. In a subsequent design 
[24], the adhesive sequence was flanked by two cysteine 
residues to form a disulfide bridge and to restrict further 
the conformation of the inserted sequence; the new
lysozyme hybrid showed a cell adhesion activity that
was approximately 5-10% of vitronectin activity [24].
Analysis of the three-dimensional structure of this construct
revealed that the RGD region was well defined
and assumed a type II' β turn conformation, with the
arginine and aspartic acid sidechains pointing in opposite
directions; this conformation was suggested to be essential
for binding to integrin with high affinity [25]. Two
other groups [26,27] used the framework of IgG to 
present the RGD sequence for integrin binding and used 
the CDR3 loops for the insertion, obtaining molecules 
showing high affinity for the fibrinogen receptor. X-ray
three-dimensional structure analysis of one of these
constructions [28] revealed that the RGD region was well
defined with a 'turn-extended-turn' conformation, and
with the arginine and aspartic acid sidechains pointing
in opposite directions, a conformation not too dissimilar
from that shown by the conformationally constrained loop
inserted in human lysozyme [25]. The authors of [28]
found that the RGD conformation had features in common 
with that of constrained peptide and nonpeptide RGD 
mimetics, known to bind to the fibrinogen receptor. This 
work confirms the utility of structural information derived 
from sequences installed on presentation scaffolds in 
the search for effective peptidomimetic templates, and 
validates the approach of presentation scaffolds in drug 
discovery. 

An interesting application of a small and well-structured
protein, the chymotrypsin inhibitor 2 from barley seeds,
as a presentation scaffold of a Gln10 repeat has been
recently reported [29]. The sequence was inserted into a
long and flexible solvent-exposed loop of the scaffold; this
insertion caused the molecule to form dimers and trimers
by association of the glutamine repeats in β-pleated sheets.
These results are in agreement with the hypothesis that
such repeats, by linking long glutamine stretches by
hydrogen bonds and provoking protein aggregation, are
the cause of inherited neurodegenerative diseases, such as
Huntington's disease and Kennedy's disease [29].

Transfer of active sites to new structural 
contexts to create novel proteins 

Few examples have been reported of the transfer of 
well-ordered active sites to structurally unrelated proteins 
with conservation of the structure and function of the 
transferred sites. Hynes et al. [30] were the first to
demonstrate that a β turn of the staphylococcal nuclease
(residues 27-31) can be substituted by a turn sequence
(residues 160-165) from concanavalin A and that, in the
resulting hybrid protein, the guest turn sequence retained
the conformation present in the parent concanavalin A
structure. No structural homology is present between
staphylococcal nuclease and concanavalin A and the
transfer operation was performed solely on the basis
of a good alignment between the β strands leading
away from the turn in the guest and in the host
protein. This experiment clearly suggests that β turns
may be transferred from one protein context to another as
structural 'cassettes'. In a similar approach, the inhibitory
loop from the soybean trypsin inhibitor (Kunitz) and
from other protease inhibitors was transferred into the
interleukin-1β structure in place of a tight turn: this
transfer operation conferred on the chimeric cytokine
specific protease sensitivity or inhibition [31].

In a more recent example, a β hairpin, representing part of
the site by which curare-mimetic snake neurotoxins bind
nicotinic acetylcholine receptors, has been transferred to
the α/β fold of the scorpion charybdotoxin (Figure 2)
[32**], a scaffold particularly stable and permissive for
sequence mutations [33]. The resulting chimeric protein
binds to the acetylcholine receptor, although with a
relatively low affinity. Structure determination of this chimera
by 1H-NMR [34] revealed that the transferred site is
stabilized by the structural scaffold in a conformation
similar to that present in the parent neurotoxin, suggesting
that the strategy of active site transfer to a stable scaffold
has general applications in the engineering of novel ligands
for membrane receptors.

In another example, the transfer of a biologically active 
loop to a new structural context resulted in an active ligand 
for the platelet integrin αIIbβ3 [35**]. The CDR3 loop of a
monoclonal antibody selected to bind the integrin receptor
αIIbβ3 with nanomolar affinity was grafted onto the
epidermal growth factor-like module of human tissue-type
plasminogen activator. Specifically, the guest loop was
grafted within a disulfide-stabilized β turn exposed on the
protein surface. The resulting chimeric protein bound the 
platelet receptor with nanomolar affinity and retained full 
enzymatic activity. Since the transferred loop sequence 
was derived from that of an antibody subjected to 'affinity 
maturation' using a phage display system [36], these 
results suggest that phage display can be combined with 
loop transfer to direct proteins to selected biological 
targets. This approach will eliminate the need to create 
new libraries for each protein studied. 



Figure 2 




Three-dimensional structure of (a) the curare-mimetic neurotoxin toxin α and (b) the scorpion scaffold charybdotoxin. The representation
emphasizes the central loop of the neurotoxin that has been transferred to the scorpion scaffold to engineer the curare-mimetic chimera. The
figure was generated from the 1nea (toxin α) and 2crd (charybdotoxin) coordinates in the Brookhaven protein data bank, using the software
Molscript. (Adapted from [32**].)
Conclusions 

The studies described in this review have demonstrated
that transfer of active sites to natural scaffolds is
effective in producing new functional proteins, which represent
interesting models useful in the understanding of
structural determinants of specificity in substrate-enzyme, 
coenzyme-enzyme, DNA-protein, ligand-receptor and 
protein-protein interactions. In some cases, the new 
proteins, because of their new specificities, have been 
shown to be valuable tools in biochemistry and protein 
chemistry, and to represent new templates useful in drug 
design and discovery. 

Furthermore, the concept of using a new structural context 
to induce a well-defined conformation of a specified 
sequence or to stabilize the structure of a predetermined 
active site seems to be quite promising. The results 
obtained in these transfer operations clearly suggest that, 
in the future, it should be possible to construct a family 
of well-defined structural domains presenting a wide set
of secondary and tertiary structural motifs, onto which
active sequences can be transferred and new functions
generated. Stably expressed on the surface of phage, the
transferred sequences can be rapidly changed, on the basis
of screening assays, to increase their biological potency
and to yield conformationally well-defined artificial proteins
active against specific biological targets.